Impact of distance-based metric learning on classification and visualization model performance and structure-activity landscapes
نویسندگان
چکیده
This study concerns large margin nearest neighbors classifier and its multi-metric extension as the efficient approaches for metric learning which aimed to learn an appropriate distance/similarity function for considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve the performance in classification, clustering and retrieval tasks. The paper describes application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in drug discovery process, in silico assessment of chemical liabilities is an important step aimed to reduce costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, a distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities, the impact of metric learning on structure-activity landscapes and predictive performance of developed models has been analyzed, the learned metric was used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest neighbors relations and descriptor space.
منابع مشابه
یادگیری نیمه نظارتی کرنل مرکب با استفاده از تکنیکهای یادگیری معیار فاصله
Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...
متن کاملComposite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
متن کاملAn Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
متن کاملRobustified distance based fuzzy membership function for support vector machine classification
Fuzzification of support vector machine has been utilized to deal with outlier and noise problem. This importance is achieved, by the means of fuzzy membership function, which is generally built based on the distance of the points to the class centroid. The focus of this research is twofold. Firstly, by taking the advantage of robust statistics in the fuzzy SVM, more emphasis on reducing the im...
متن کاملThe social presence theory in distance education; the role of social presence in web-based educational environment
Dear editor-in-chief, The Future of Medical Education Journal The emergence of new theories of learning and education and a paradigm shift from being teacher-centered to student-centered alongside with advancement of novel communication technologies have set the grounds for modern human to use new methods of teaching-learning and free himself of the chains of time and place and keep on learning...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Journal of computer-aided molecular design
دوره 28 2 شماره
صفحات -
تاریخ انتشار 2014